Abstract

In the k-2VC problem, we are given an undirected graph G with edge costs and an integer k; the goal
is to find a minimum-cost 2-vertex-connected subgraph of G containing at least k vertices. A slightly
more general version is obtained if the input also specifies a subset S ⊆ V of terminals and the goal is
to find a subgraph containing at least k terminals. Closely related to the k-2VC problem, and in fact a
special case of it, is the k-2EC problem, in which the goal is to find a minimum-cost 2-edge-connected
subgraph containing k vertices. The k-2EC problem was introduced by Lau et al. [22], who also gave
a poly-logarithmic approximation for it. No previous approximation algorithm was known for the more
general k-2VC problem. We describe an O(log n · log k) approximation for the k-2VC problem.

1 Introduction 

Connectivity and network design problems play an important role in combinatorial optimization and algorithms
both for their theoretical appeal and their many real-world applications. An interesting and large class of
problems is of the following type: given a graph G(V, E) with edge or node costs, find a minimum-cost
subgraph H of G that satisfies certain connectivity properties. For example, given an integer λ > 0, one can
ask for the minimum-cost spanning subgraph that is λ-edge or λ-vertex connected. If λ = 1 then this is the
classical minimum spanning tree (MST) problem. For λ > 1 the problem is NP-hard and also APX-hard to
approximate. More general versions of connectivity problems are obtained if one seeks a subgraph in which a
subset of the nodes S ⊆ V, referred to as terminals, are λ-connected. The well-known Steiner tree problem is
to find a minimum-cost subgraph that (1-)connects a given set S. Many of these problems are special cases of
the survivable network design problem (SNDP). In SNDP, each pair of nodes u, v specifies a connectivity
requirement r(u, v) and the goal is to find a minimum-cost subgraph that has r(u, v) disjoint paths for each
pair u, v. Given the intractability of these connectivity problems, there has been a large amount of work on
approximation algorithms. A number of elegant and powerful techniques and results have been developed over
the years (see [19, 25]). In particular, the primal-dual method [1, 17] and iterated rounding [20] have led to
some remarkable results including a 2-approximation for edge-connectivity SNDP [20].

An interesting class of problems, related to some of the connectivity problems described above, is obtained
by requiring that only k of the given terminals be connected. These problems are partly motivated by
applications in which one seeks to maximize profit given an upper bound (budget) on the cost. For example, a useful
problem in vehicle routing applications is to find a path that maximizes the number of vertices in it subject to a
budget B on the length of the path. In the exact optimization setting, the profit maximization problem is
equivalent to the problem of minimizing the cost/length of a path subject to the constraint that at least k vertices are
included. Of course the two versions need not be approximation equivalent; nevertheless, understanding one
is often fruitful or necessary to understand the other. The most well-studied of these problems is the k-MST
problem; the goal here is to find a minimum-cost subgraph of the given graph G that contains at least k vertices
(or terminals). This problem has attracted considerable attention in the approximation algorithms literature and
its study has led to several new algorithmic ideas and applications [3, 15, 14, 4]. We note that the Steiner tree
problem can be relatively easily reduced in an approximation preserving fashion to the k-MST problem. More
recently, Lau et al. [22] considered the natural generalization of k-MST to higher connectivity. In particular
they defined the (k, λ)-subgraph problem to be the following: find a minimum-cost subgraph of the given graph
G that contains at least k vertices and is λ-edge connected. We use the notation k-λEC to refer to this problem.
In [22] an O(log^ k) approximation was claimed for the k-2EC problem. However, the algorithm and proof in
[22] are incorrect. More recently, and in independent work from ours, the authors of [22] obtained a different
algorithm for k-2EC that yields an O(log n log k) approximation. We give later a more detailed comparison
between their approach and ours. It is also shown in [22] that a good approximation for k-λEC when λ is
large would yield an improved algorithm for the k-densest subgraph problem [12]; in this problem one seeks a
k-vertex subgraph of a given graph G that has the maximum number of edges. The k-densest subgraph problem
admits an O(n^δ) approximation for some fixed constant δ < 1/3 [12], but has resisted attempts at an improved
approximation for a number of years now.

In this paper we consider the vertex-connectivity generalization of the k-MST problem. We define the
k-λVC problem as follows: Given an integer k and a graph G with edge costs, find the minimum-cost
λ-vertex-connected subgraph of G that contains at least k vertices. We also consider the terminal version of the
problem where the subgraph has to contain k terminals from a given terminal set S ⊆ V. It can be easily
shown that the k-λEC problem reduces to the k-λVC problem for any k > 1. We also observe that the
k-λEC problem with terminals can be easily reduced, as follows, to the uniform problem where every vertex is
a terminal: For each terminal v ∈ S, create n dummy vertices v_1, v_2, ..., v_n and attach each v_i to v with λ parallel
edges of zero cost. Now set k' = kn in the new graph. One can avoid using parallel edges by creating a
clique on v_1, v_2, ..., v_n using zero-cost edges and connecting λ of these vertices to v. Note, however, that this
reduction only works for edge-connectivity. We are not aware of a reduction that reduces the k-λVC problem
with a given set of terminals to the k-λVC problem, even when λ = 2. In this paper we consider the k-2VC
problem; our main result is the following.

Theorem 1.1. There is an O(log ℓ · log k) approximation for the k-2VC problem where ℓ is the number of
terminals.

Corollary 1.2. There is an O(log ℓ · log k) approximation for the k-2EC problem where ℓ is the number of
terminals.

One of the technical ingredients that we develop is the theorem below which may be of independent interest. 
Given a graph G with edge costs and weights on terminals S ⊆ V, we define density(H) for a subgraph H to
be the ratio of the cost of edges in H to the total weight of terminals in H.

Theorem 1.3. Let G be a 2-vertex-connected graph with edge costs and let S ⊆ V be a set of terminals.
Then, there is a simple cycle C containing at least 2 terminals (a non-trivial cycle) such that the density of C is
at most the density of G. Moreover, such a cycle can be found in polynomial time.

Using the above theorem and an LP approach we obtain the following. 

Corollary 1.4. Given a graph G(V, E) with edge costs and ℓ terminals S ⊆ V, there is an O(log ℓ)
approximation for the problem of finding a minimum-density non-trivial cycle.

Note that Theorem 1.3 and Corollary 1.4 are of interest because we seek a cycle with at least two terminals.
A minimum-density cycle containing only one terminal can be found by using the well-known min-mean cycle
algorithm in directed graphs IH. We remark, however, that although we suspect that the problem of finding a
minimum-density non-trivial cycle is NP-hard, we currently do not have a proof. Theorem 1.3 shows that the
problem is equivalent to the dens-2VC problem, defined in the next section.

Remark: The reader may wonder whether k-2EC or k-2VC admit a constant factor approximation, since the
k-MST problem admits one. We note that the main technical tool which underlies O(1) approximations for
the k-MST problem [4, 15, 14] is a special property that holds for an LP relaxation of the prize-collecting Steiner
tree problem [17], which is a Lagrangian relaxation of the Steiner tree problem. Such a property is not known to
hold for generalizations of k-MST including k-2EC and k-2VC and the k-Steiner forest problem [18]. Thus,
one is forced to rely on alternative and problem-specific techniques.

1.1 Overview of Technical Ideas 

We consider the rooted version of k-2VC: the goal is to find a min-cost subgraph that 2-connects at least k
terminals to a specified root vertex r. It is relatively straightforward to reduce k-2VC to its rooted version (see
Section 2 for details). We draw inspiration from algorithmic ideas that led to poly-logarithmic approximations
for the k-MST problem.

To describe our approach to the rooted k-2VC problem, we define a closely related problem. For a subgraph
H that contains r, let k(H) be the number of terminals that are 2-connected to r in H. Then the density of
H is simply the ratio of the cost of H to k(H). The dens-2VC problem is to find a 2-connected subgraph of
minimum density. An O(log ℓ) approximation for the dens-2VC problem (where ℓ is the number of terminals)
can be derived in a somewhat standard way by using a bucketing and scaling trick on a linear programming
relaxation for the problem. We exploit the known bound of 2 on the integrality gap of a natural LP for the
SNDP problem with vertex connectivity requirements in {0, 1, 2} [13]. The bucketing and scaling trick has
seen several uses in the past and has recently been highlighted in several applications IlllHlTl. 

Our algorithm for k-2VC uses a greedy approach at the high level. We start with an empty subgraph G'
and use the approximation algorithm for dens-2VC in an iterative fashion to greedily add terminals to G' until
at least k' ≥ k terminals are in G'. This approach would yield an O(log ℓ · log k) approximation if k' = O(k).
However, the last iteration of the dens-2VC algorithm may add many more terminals than desired with the
result that k' ≫ k. In this case we cannot bound the quality of the solution obtained by the algorithm. To
overcome this problem, one can try to prune the subgraph H added in the last iteration to only have the desired
number of terminals. For the k-MST problem, H is a tree and pruning is quite easy. We remark that this yields
a rather straightforward O(log n log k) approximation for k-MST and could have been discovered much before
a more clever analysis given in [3].

One of our technical contributions is to give a pruning step for the k-2VC problem. To accomplish this, we
use two algorithmic ideas. The first is encapsulated in the cycle finding algorithm of Theorem 1.3. Second, we
use this cycle finding algorithm to repeatedly merge subgraphs until we get the desired number of terminals in
one subgraph. This latter step requires care. The cycle merging scheme is inspired by a similar approach from
the work of Lau et al. [22] on the k-2EC problem and in [10] on the directed orienteering problem. These ideas
yield an O(log ℓ · log^ k) approximation. We give a slightly modified cycle-merging algorithm with a more
sophisticated and non-trivial analysis to obtain an improved O(log ℓ · log k) approximation.

Some remarks are in order to compare our work to that of [22] on the k-2EC problem. The combinatorial
algorithm in [22] is based on finding a low-density cycle or a related structure called a bi-cycle. The algorithm
in [22] to find such a structure is incorrect. Further, the cycles are contracted along the way, which limits the
approach to the k-2EC problem (contracting a cycle in a 2-node-connected graph may make the resulting graph
not 2-node-connected). In our algorithm we do not contract cycles and instead introduce dummy terminals with
weights to capture the number of terminals in an already formed component. This requires us to now address
the minimum-density non-trivial simple cycle problem, which we do via Theorem 1.3 and Corollary 1.4. In
independent work, Lau et al. [23] obtain a new and correct O(log n log k)-approximation for k-2EC. They
also follow the same approach that we do in using the LP for finding dense subgraphs followed by the pruning
step. However, in the pruning step they use a completely different approach; they use the sophisticated idea of
nowhere-zero 6-flows [24]. Although the use of this idea is elegant, the approach works only for the k-2EC
problem, while our approach is less complex and leads to an algorithm for the more general k-2VC problem.

2 The Algorithm for the k-2VC Problem

We work with graphs in which some vertices are designated as terminals. Given a graph G with edge costs
and terminal weights, we define the density of a subgraph H to be the sum of the costs of edges in H divided by
the sum of the weights of terminals in H. Henceforth, we use 2-connected graph to mean a 2-vertex-connected
graph.
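As a small illustration of this definition, the following sketch computes the density of a subgraph given as an edge list. The representation (a dict of edge costs keyed by vertex pairs, a dict of terminal weights) and the function name `density` are assumptions made for this example only; they do not come from the paper.

```python
from math import inf


def density(edges, cost, weight):
    """Cost of the edges of H divided by the total weight of terminals in H.

    edges  -- iterable of (u, v) pairs forming the subgraph H
    cost   -- dict mapping each (u, v) pair to its edge cost
    weight -- dict mapping each terminal to its weight (non-terminals absent)
    """
    edges = list(edges)
    vertices = {x for e in edges for x in e}
    total_cost = sum(cost[e] for e in edges)
    total_weight = sum(weight.get(v, 0) for v in vertices)
    # A subgraph without any terminal weight is given infinite density.
    return inf if total_weight == 0 else total_cost / total_weight


# Hypothetical instance: a triangle a-b-c with unit-weight terminals a and b.
cost = {("a", "b"): 3, ("b", "c"): 2, ("c", "a"): 1}
weight = {"a": 1, "b": 1}
triangle_density = density(cost.keys(), cost, weight)  # (3 + 2 + 1) / 2
```

Treating a terminal-free subgraph as having infinite density is a convention chosen for this sketch; it keeps such subgraphs from ever being preferred by a minimization.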

The goal of the k-2VC problem is to find a minimum-cost 2-connected subgraph on at least k terminals.¹
Recall that in the rooted k-2VC problem, the goal is to find a min-cost subgraph on at least k terminals in
which every terminal is 2-connected to the specified root r. The (unrooted) k-2VC problem can be reduced to
the rooted version by guessing 2 vertices u, v that are in an optimal solution, creating a new root vertex r, and
connecting it with 0-cost edges to u and v. It is not hard to show that any solution to the rooted problem in
the modified graph can be converted to a solution to the unrooted problem by adding 2 minimum-cost
vertex-disjoint paths between u and v. (Since u and v are in the optimal solution, the cost of these added paths cannot
be more than OPT.) We omit further details from this extended abstract.

In the dens-2VC problem, the goal is to find a subgraph H of minimum density in which all terminals of H
are 2-connected to the root. The following lemma is proved in Section 2.1 below. It relies on a 2-approximation,
via a natural LP, for the min-cost 2-connectivity problem due to Fleischer, Jain and Williamson [13], and some
standard techniques.

Lemma 2.1. There is an O(log ℓ)-approximation algorithm for the dens-2VC problem, where ℓ is the number
of terminals in the given instance.

Let OPT be the cost of an optimal solution to the k-2VC problem. We assume knowledge of OPT; this
can be dispensed with using standard methods. We pre-process the graph by deleting any terminal that does not
have 2 vertex-disjoint paths to the root r of total cost at most OPT. The high-level description of the algorithm
for the rooted k-2VC problem is given below.



k' ← k, G' is the empty graph.
While (k' > 0):
    Use the approximation algorithm for dens-2VC to find a subgraph H in G.
    If (k(H) ≤ k'):
        G' ← G' ∪ H, k' ← k' − k(H)
        Mark all terminals in H as non-terminals.
    Else:
        Prune H to obtain H' that contains k' terminals.
        G' ← G' ∪ H', k' ← 0
Output G'
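The control flow of this greedy procedure can be sketched in Python as follows. This is only an outline under stated assumptions: the two oracles `dens_2vc` and `prune`, their signatures, and the edge-set representation are hypothetical interfaces invented for the example, and the toy oracles used in the dry run are not approximation algorithms.

```python
def greedy_k2vc(k, terminals, dens_2vc, prune):
    """Greedy outer loop; dens_2vc and prune are caller-supplied oracles.

    dens_2vc(active) -> (edges, covered): a low-density subgraph H and the
        set of still-active terminals it 2-connects to the root.
    prune(edges, covered, need) -> edges of a subgraph H' with `need` terminals.
    """
    remaining = k                # k' in the text
    active = set(terminals)      # terminals not yet marked as non-terminals
    solution = set()             # edge set of G'
    while remaining > 0:
        edges, covered = dens_2vc(active)
        if not covered:
            raise ValueError("oracle returned a subgraph with no terminals")
        if len(covered) <= remaining:        # augmentation step
            solution |= set(edges)
            remaining -= len(covered)
            active -= set(covered)
        else:                                # pruning step
            solution |= set(prune(edges, covered, remaining))
            remaining = 0
    return solution


# Toy oracles for a dry run only: each call "covers" up to 3 active terminals
# by a direct edge from a root named "r".
def toy_dens(active):
    chosen = sorted(active)[:3]
    return {("r", t) for t in chosen}, set(chosen)


def toy_prune(edges, covered, need):
    return {("r", t) for t in sorted(covered)[:need]}


result = greedy_k2vc(5, range(10), toy_dens, toy_prune)
```

In the dry run the first call covers three terminals (an augmentation step), and the second call returns three terminals when only two are still needed, so the pruning branch is exercised.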

¹ In fact, our algorithm solves the harder problem in which terminals have weights, and the goal is to find a minimum-cost
2-connected subgraph in which the sum of terminal weights is at least k. For simplicity of exposition, however, we stick to the more
restricted version.

At the beginning of any iteration of the while loop, the graph contains a solution to the dens-2VC problem
of density at most OPT/k'. Therefore, the graph H returned always has density at most O(log ℓ) · OPT/k'. If k(H) ≤ k',
we add H to G' and decrement k'; we refer to this as the augmentation step. Otherwise, we have a graph H of
good density, but with too many terminals. In this case, we prune H to find a graph with the required number
of terminals; this is the pruning step. A simple set-cover type argument shows the following lemma:

Lemma 2.2. If, at every augmentation step, we add a graph of density at most O(log ℓ) · OPT/k' (where k' is the
number of additional terminals that must be selected), the total cost of all the augmentation steps is at most
O(log ℓ · log k) OPT.
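The set-cover type argument can be checked numerically. If step j adds k(H_j) terminals when k'_j are still needed, its cost is at most k(H_j) times the density bound, so the total is at most O(log ℓ) · OPT times the sum of k(H_j)/k'_j, and that sum never exceeds the harmonic number H_k = O(log k). The sketch below evaluates both quantities exactly; the function names and the sample step sequence are assumptions for illustration.

```python
from fractions import Fraction


def augmentation_cost_factor(k, steps):
    """Sum of k(H_j)/k'_j over augmentation steps.

    Multiplying the result by c * log(l) * OPT bounds the total cost of the
    augmentation steps, where c is the constant hidden in the density bound.
    steps -- number of terminals k(H_j) added in each augmentation step
    """
    remaining = k
    total = Fraction(0)
    for added in steps:
        if not 0 < added <= remaining:
            raise ValueError("an augmentation step adds between 1 and k' terminals")
        total += Fraction(added, remaining)
        remaining -= added
    return total


def harmonic(k):
    """H_k = 1 + 1/2 + ... + 1/k."""
    return sum(Fraction(1, i) for i in range(1, k + 1))


factor = augmentation_cost_factor(10, [4, 3, 2, 1])   # 4/10 + 3/6 + 2/3 + 1/1
```

The bound holds because added/remaining is at most the sum of 1/i for i from remaining − added + 1 to remaining, and these ranges are disjoint across steps.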

Therefore, we now only have to bound the cost of the graph H' added in the pruning step; we prove the
following theorem in Section 4.

Theorem 2.3. Let (G, k) be an instance of the rooted k-2VC problem with root r, such that every vertex of G
has 2 vertex-disjoint paths to r of total cost at most L, and such that density(G) ≤ ρ. There is a polynomial-time
algorithm to find a solution to this instance of cost at most O(log k)ρk + 2L.

We can now prove our main result for the k-2VC problem, Theorem 1.1.

Proof of Theorem 1.1. Let OPT be the cost of an optimal solution to the (rooted) k-2VC problem. By
Lemma 2.2, the total cost of the augmentation steps of our greedy algorithm is O(log ℓ · log k) OPT. To bound
the cost of the pruning step, let k' be the number of additional terminals that must be covered just prior to this
step. The algorithm for the dens-2VC problem returns a graph H with k(H) > k' terminals, and density at
most O(log ℓ) · OPT/k'. As a result of our pre-processing step, every vertex has 2 vertex-disjoint paths to r of total
cost at most OPT. Now, we use Theorem 2.3 to prune H and find a graph H' with k' terminals and cost at most
O(log k) · density(H) · k' + 2 OPT ≤ O(log ℓ · log k) OPT + 2 OPT. Therefore, the total cost of our solution is
O(log ℓ · log k) OPT. □

It remains only to prove Lemma 2.1, that there is an O(log ℓ)-approximation for the dens-2VC problem,
and Theorem 2.3, bounding the cost of the pruning step. We prove the former in Section 2.1 below. Before the
latter is proved in Section 4, we develop some tools in Section 3; chief among these tools is Theorem 1.3.

2.1 An O(log ℓ)-approximation for the dens-2VC problem

Recall that the dens-2VC problem was defined as follows: Given a graph G(V, E) with edge costs, a set
T ⊆ V of terminals, and a root r ∈ V(G), find a subgraph H of minimum density, in which every terminal of
H is 2-connected to r. (Here, the density of H is defined as the cost of H divided by the number of terminals it
contains, not including r.) We describe an algorithm for dens-2VC that gives an O(log ℓ)-approximation, and
sketch its proof. We use an LP based approach and a bucketing and scaling trick (see (T) HI |3 for applications 
of this idea), and a constant-factor bound on the integrality gap of an LP for SNDP with vertex-connectivity
requirements in {0, 1, 2} [13].

We define LP-dens as the following LP relaxation of dens-2VC. For each terminal t, the variable y_t
indicates whether or not t is chosen in the solution. (By normalizing Σ_{t∈T} y_t to 1, and minimizing the sum of
edge costs, we minimize the density.) C_t is the set of all simple cycles containing t and the root r; for any
C ∈ C_t, f_C indicates how much 'flow' is sent from t to r through C. (Note that a pair of vertex-disjoint paths
is a cycle; the flow along a cycle is 1 if we can 2-connect t to r using the edges of the cycle.) The variable x_e
indicates whether the edge e is used by the solution.

$$
\begin{aligned}
\min \quad & \sum_{e \in E} c(e)\, x_e \\
\text{s.t.} \quad & \sum_{t \in T} y_t = 1 \\
& \sum_{C \in C_t} f_C \ge y_t && (\forall t \in T) \\
& \sum_{C \in C_t \,:\, e \in C} f_C \le x_e && (\forall t \in T,\ e \in E) \\
& x_e,\ f_C,\ y_t \ge 0
\end{aligned}
$$

It is not hard to see that an optimal solution to LP-dens has cost at most the density of an optimal solution
to dens-2VC. We now show how to obtain an integral solution of density at most O(log ℓ) OPT_LP, where
OPT_LP is the cost of an optimal solution to LP-dens. The linear program LP-dens has an exponential number
of variables but a polynomial number of non-trivial constraints; it can, however, be solved in polynomial time.
Fix an optimal solution to LP-dens of cost OPT_LP, and for each 0 ≤ i < 2 log ℓ (for ease of notation, assume
log ℓ is an integer), let Y_i be the set of terminals t such that 2^{-(i+1)} < y_t ≤ 2^{-i}. Since Σ_{t∈T} y_t = 1, there is
some index i such that Σ_{t∈Y_i} y_t ≥ 1/(2 log ℓ). Since every terminal t ∈ Y_i has y_t ≤ 2^{-i}, the number of terminals
in Y_i is at least 2^i/(2 log ℓ). We claim that there is a subgraph H of G with cost at most O(2^{i+1} OPT_LP), in which
every terminal of Y_i is 2-connected to the root. If this is true, the density of H is at most O(log ℓ · OPT_LP),
and hence we have an O(log ℓ)-approximation for the dens-2VC problem.

To prove our claim about the cost of the subgraph H in which every terminal of Y_i is 2-connected to r,
consider scaling up the given optimum solution of LP-dens by a factor of 2^{i+1}. For each terminal t ∈ Y_i, the
flow from t to r in this scaled solution² is at least 1, and the cost of the scaled solution is 2^{i+1} OPT_LP.

In [13], the authors describe a linear program LP_1 to find a minimum-cost subgraph in which a given set of
terminals is 2-connected to the root, and show that this linear program has an integrality gap of 2. The variables
x_e in the 'scaled solution' to LP-dens correspond to a feasible solution of LP_1 with Y_i as the set of terminals;
the integrality gap of 2 implies that there is a subgraph H in which every terminal of Y_i is 2-connected to the
root, with cost at most 2 · 2^{i+1} OPT_LP.

Therefore, the algorithm for dens-2VC is:

1. Find an optimal fractional solution to LP-dens.

2. Find a set of terminals Y_i such that Σ_{t∈Y_i} y_t ≥ 1/(2 log ℓ).

3. Find a min-cost subgraph H in which every terminal in Y_i is 2-connected to r using the algorithm of [13].
H has density at most O(log ℓ) times the optimal solution to dens-2VC.
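Step 2 is a purely combinatorial selection once the fractional values y_t are known. The sketch below shows one way to carry it out, assuming the LP has already been solved elsewhere; the function name `heaviest_bucket`, the dict representation of y, and the sample values are hypothetical. It returns the bucket of largest total y-value, which in particular is at least as heavy as the bucket whose existence is argued above.

```python
import math


def heaviest_bucket(y):
    """Return (i, Y_i) for the bucket of largest total y-value.

    y -- dict mapping each terminal t to its LP value y_t (values sum to 1).
    Bucket i holds the terminals with 2^-(i+1) < y_t <= 2^-i; only buckets
    0 <= i < 2 * ceil(log2(l)) are considered, as in the text.
    """
    num_buckets = max(1, 2 * math.ceil(math.log2(len(y))))
    buckets = {}
    for t, value in y.items():
        if value <= 0:
            continue
        i = math.floor(-math.log2(value))    # 2^-(i+1) < value <= 2^-i
        if i < num_buckets:                  # very small values are discarded
            buckets.setdefault(i, []).append(t)
    best = max(buckets, key=lambda i: sum(y[t] for t in buckets[i]))
    return best, buckets[best]


# Hypothetical fractional solution on four terminals.
y = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
i, Y_i = heaviest_bucket(y)
scale = 2 ** (i + 1)   # scaling factor applied to the LP solution
```

After scaling by 2^{i+1}, every terminal in the chosen bucket has scaled value at least 1, which is what the rounding step via the LP of [13] requires.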



3 Finding Low-density Non-trivial Cycles 

A cycle C ⊆ G is non-trivial if it contains at least 2 terminals. We define the min-density non-trivial cycle
problem: Given a graph G(V, E), with S ⊆ V marked as terminals, edge costs and terminal weights, find a
minimum-density cycle that contains at least 2 terminals. Note that if we remove the requirement that the cycle
be non-trivial (that is, it contains at least 2 terminals), the problem reduces to the min-mean cycle problem in
directed graphs, and can be solved exactly in polynomial time (see 111). Algorithms for the min-density
non-trivial cycle problem are a useful tool for solving the k-2VC and k-2EC problems. In this section, we give an
O(log ℓ)-approximation algorithm for the minimum-density non-trivial cycle problem.

² This is an abuse of the term 'solution', since after scaling, Σ_{t∈T} y_t = 2^{i+1}.

First, we prove Theorem 1.3, that a 2-connected graph with edge costs and terminal weights contains a
simple non-trivial cycle, with density no more than the average density of the graph. We give two algorithms to
find such a cycle; the first, described in Section 3.1, is simpler, but the running time is not polynomial. A more
technical proof that leads to a strongly polynomial-time algorithm is described in Section 3.2; we recommend
this proof be skipped on a first reading.

3.1 An Algorithm to Find Cycles of Average Density 

To find a non-trivial cycle of density at most that of the 2-connected input graph G, we will start with an
arbitrary non-trivial cycle, and successively find cycles of better density until we obtain a cycle with density at
most density(G). The following lemma shows that if a cycle C has an ear with density less than density(C),
we can use this ear to find a cycle of lower density.

Lemma 3.1. Let C be a non-trivial cycle, and H an ear incident to C at u and v, such that
cost(H)/weight(H − {u, v}) < density(C). Let S_1 and S_2 be the two internally disjoint paths between u and v in C.
Then H ∪ S_1 and H ∪ S_2 are both simple cycles and one of these is non-trivial and has density less than density(C).

Proof. C has at least 2 terminals, so it has finite density; H must then have at least 1 terminal. Let c_1, c_2 and
c_H be, respectively, the sum of the costs of the edges in S_1, S_2 and H, and let w_1, w_2 and w_H be the sum of the
weights of the terminals in S_1, S_2 and H − {u, v}.

Assume w.l.o.g. that S_1 has density at most that of S_2. (That is, c_1/w_1 ≤ c_2/w_2.)³ S_1 must contain at
least one terminal, and so H ∪ S_1 is a simple non-trivial cycle. The statement density(H ∪ S_1) < density(C)
is equivalent to (c_H + c_1)(w_1 + w_2) < (c_1 + c_2)(w_H + w_1).

$$
\begin{aligned}
(c_H + c_1)(w_1 + w_2) &= c_1 w_1 + c_1 w_2 + c_H (w_1 + w_2) \\
&\le c_1 w_1 + c_2 w_1 + c_H (w_1 + w_2) && (\text{density}(S_1) \le \text{density}(S_2)) \\
&< c_1 w_1 + c_2 w_1 + (c_1 + c_2)\, w_H && (c_H / w_H < \text{density}(C)) \\
&= (c_1 + c_2)(w_H + w_1)
\end{aligned}
$$

Therefore, H ∪ S_1 is a simple cycle containing at least 2 terminals of density less than density(C). □
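The exchange in Lemma 3.1 uses only the six numbers c_1, w_1, c_2, w_2, c_H, w_H, so it is easy to sketch. The function below is an illustrative assumption, not the paper's code: it takes integer costs and weights, keeps the arc of C of lower density (never an arc without terminal weight), and returns the density of the cycle formed with the ear.

```python
from fractions import Fraction


def better_cycle(c1, w1, c2, w2, cH, wH):
    """Pick the arc of C to keep when the other arc is replaced by the ear H.

    (c1, w1), (c2, w2) -- cost and terminal weight of the arcs S1, S2 of C
    (cH, wH)           -- cost and terminal weight of the ear H without u, v
    Returns (label, density) for the cycle H + S1 or H + S2.
    """
    dens_C = Fraction(c1 + c2, w1 + w2)
    if wH <= 0 or Fraction(cH, wH) >= dens_C:
        raise ValueError("the ear must have density below density(C)")
    # Compare c1/w1 <= c2/w2 by cross-multiplication; an arc of weight 0 is
    # never preferred over an arc that contains terminals.
    if w1 > 0 and (w2 == 0 or c1 * w2 <= c2 * w1):
        label, c, w = "S1", c1, w1
    else:
        label, c, w = "S2", c2, w2
    return label, Fraction(cH + c, wH + w)


# Hypothetical numbers: density(C) = 10/3 and the ear has density 1.
label, new_density = better_cycle(4, 2, 6, 1, 1, 1)
```

The chosen arc has density at most density(C), and the ear has density strictly below it, so the combined cycle has density strictly below density(C), matching the chain of inequalities in the proof.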

Lemma 3.2. Given a cycle C in a 2-connected graph G, let G' be the graph formed from G by contracting C
to a single vertex v. If H is a connected component of G' − v, H ∪ {v} is 2-connected in G'.

Proof. Let H be an arbitrary connected component of G' − v, and let H' = H ∪ {v}. To prove that H' is
2-connected, we first observe that v is 2-connected to any vertex x ∈ H. (Any set that separates x from v in H'
separates x from the cycle C in G.)

It now follows that for all vertices x, y ∈ V(H), x and y are 2-connected in H'. Suppose deleting some
vertex u separates x from y. The vertex u cannot be v, since H is a connected component of G' − v. But if
u ≠ v, v and x are in the same component of H' − u, since v is 2-connected to x in H'. Similarly, v and y are
in the same component of H' − u, and so deleting u does not separate x from y. □

We now show that given any 2-connected graph G, we can find a non-trivial cycle of density no more than 
that of G. 

³ It is possible that one of S_1 and S_2 has cost and weight 0. In this case, let S_1 be the component with non-zero weight.

Theorem 3.3. Let G be a 2-connected graph with at least 2 terminals. G contains a simple non-trivial cycle
X such that density(X) ≤ density(G).

Proof. Let C be an arbitrary non-trivial simple cycle; such a cycle always exists since G is 2-connected and
has at least 2 terminals. If density(C) > density(G), we give an algorithm that finds a new non-trivial cycle C'
such that density(C') < density(C). Repeating this process, we obtain cycles of successively better densities
until eventually finding a non-trivial cycle X of density at most density(G).

Let G' be the graph formed by contracting the given cycle C to a single vertex v. In G', v is not a terminal,
and so has weight 0. Consider the 2-connected components of G' (from Lemma 3.2, each such component is
formed by adding v to a connected component of G' − v), and pick the one of minimum density. If H is this
component, density(H) < density(G) by an averaging argument.

H contains at least 1 terminal. If it contains 2 or more terminals, recursively find a non-trivial cycle C'
in H such that density(C') ≤ density(H) < density(G). If C' exists in the given graph G, it has the desired
properties, and we are done. Otherwise, C' contains v, and the edges of C' form an ear of C in the original graph
G. The density of this ear is less than the density of G, so we can apply Lemma 3.1 to obtain a non-trivial cycle
in G that has density less than density(C).

Finally, if H has exactly 1 terminal u, find any 2 vertex-disjoint paths using edges of H from u to distinct
vertices in the cycle C. (Since G is 2-connected, there always exist such paths.) The cost of these paths is at
most cost(H), and concatenating these 2 paths corresponds to an ear of C in G. The density of this ear is less
than density(G); again, we use Lemma 3.1 to obtain a cycle in G with the desired properties. □
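On very small instances the statement of Theorem 3.3 can be checked by brute force. The sketch below is not the algorithm from the proof: it simply enumerates every simple cycle (exponential time) and reports the best non-trivial one. The graph, its costs and terminal weights, and all function names are made up for this illustration.

```python
from fractions import Fraction
from itertools import combinations, permutations


def simple_cycles(vertices, edges):
    """Yield (vertex_order, edge_set) for each simple cycle of a small graph."""
    edge_set = {frozenset(e) for e in edges}
    seen = set()
    for r in range(3, len(vertices) + 1):
        for subset in combinations(vertices, r):
            for order in permutations(subset[1:]):
                cyc = (subset[0],) + order
                cyc_edges = frozenset(
                    frozenset((cyc[i], cyc[(i + 1) % r])) for i in range(r))
                if cyc_edges <= edge_set and cyc_edges not in seen:
                    seen.add(cyc_edges)
                    yield cyc, cyc_edges


def best_nontrivial_cycle(vertices, cost, weight):
    """Minimum-density cycle containing at least 2 terminals, by enumeration."""
    best = None
    for cyc, cyc_edges in simple_cycles(vertices, cost.keys()):
        terminals = [v for v in cyc if v in weight]
        if len(terminals) < 2:
            continue                       # trivial cycle, skip it
        d = Fraction(sum(cost[e] for e in cyc_edges),
                     sum(weight[t] for t in terminals))
        if best is None or d < best[0]:
            best = (d, cyc)
    return best


# A 4-cycle a-b-c-d with an expensive chord a-c; terminals a and b.
cost = {frozenset("ab"): 1, frozenset("bc"): 1, frozenset("cd"): 1,
        frozenset("da"): 1, frozenset("ac"): 5}
weight = {"a": 1, "b": 1}
density_G = Fraction(sum(cost.values()), sum(weight.values()))
best_density, best_cycle = best_nontrivial_cycle("abcd", cost, weight)
```

Here the triangle a-c-d is trivial (it contains only the terminal a), while the outer 4-cycle is non-trivial and has density 2, below density(G) = 9/2, as the theorem predicts.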

We remark again that the algorithm of Theorem 3.3 does not lead to a polynomial-time algorithm, even if all
edge costs and terminal weights are polynomially bounded. In Section 3.2, we describe a strongly
polynomial-time algorithm that, given a graph G, finds a non-trivial cycle of density at most that of G. Note that neither of
these algorithms may directly give a good approximation to the min-density non-trivial cycle problem, because
the optimal non-trivial cycle may have density much less than that of G. However, we can use Theorem 3.3 to
prove the following theorem:

Theorem 3.4. There is an α-approximation to the (unrooted) dens-2VC problem if and only if there is an
α-approximation to the problem of finding a minimum-density non-trivial cycle.

Proof. Assume we have a γ(ℓ)-approximation for the dens-2VC problem; we use it to find a low-density
non-trivial cycle. Solve the dens-2VC problem on the given graph; since the optimal cycle is a 2-connected graph,
our solution H to the dens-2VC problem has density at most γ(ℓ) times the density of this cycle. Find a
non-trivial cycle in H of density at most that of H; it has density at most γ(ℓ) times that of an optimal non-trivial
cycle.

Note that any instance of the (unrooted) dens-2VC problem has an optimal solution that is a non-trivial
cycle. (Consider any optimal solution H of density ρ; by Theorem 1.3, H contains a non-trivial cycle of
density at most ρ. This cycle is a valid solution to the dens-2VC problem.) Therefore, a β(ℓ)-approximation
for the min-density non-trivial cycle problem gives a β(ℓ)-approximation for the dens-2VC problem. □

Theorem 3.4 and Lemma 2.1 imply an O(log ℓ)-approximation for the minimum-density non-trivial cycle
problem; this proves Corollary 1.4.

We say that a graph G(V, E) is minimally 2-connected on its terminals if for every edge e ∈ E, some pair
of terminals is not 2-connected in the graph G − e. Section 3.2 shows that in any graph which is minimally
2-connected on its terminals, every cycle is non-trivial. Therefore, the problem of finding a minimum-density
non-trivial cycle in such graphs is just that of finding a minimum-density cycle, which can be solved exactly
in polynomial time. However, as we explain at the end of the section, this does not directly lead to an efficient
algorithm for arbitrary graphs.

Figure 1: H is an earring of C, with clasps C4, CQ, cg; C4 is its first clasp, and cg its last clasp. The arrow
indicates the arc of H.

3.2 A Strongly Polynomial-time Algorithm to Find Cycles of Average Density 

In this section, we describe a strongly polynomial-time algorithm which, given a 2-connected graph G(V, E)
with edge costs and terminal weights, finds a non-trivial cycle of density at most that of G.

We begin with several definitions: Let C be a cycle in a graph G, and G' be the graph formed by deleting
C from G. Let H_1, H_2, ..., H_m be the connected components of G'; we refer to these as earrings of C.⁴ For
each H_i, let the vertices of C incident to it be called its clasps. From the definition of an earring, for any pair
of clasps of H_i, there is a path between them whose internal vertices are all in H_i.

We say that a vertex of C is an anchor if it is the clasp of some earring. (An anchor may be a clasp of
multiple earrings.) A segment S of C is a path contained in C, such that the endpoints of S are both anchors,
and no internal vertex of S is an anchor. (Note that the endpoints of S might be clasps of the same earring, or
of distinct earrings.) It is easy to see that the segments partition the edge set of C. By deleting a segment, we
refer to deleting its edges and internal vertices. Observe that if S is deleted from G, the only vertices of G − S
that lose an edge are the endpoints of S. A segment is safe if the graph G − S is 2-connected.

Arbitrarily pick a vertex o of C as the origin, and consecutively number the vertices of C clockwise around
the cycle as $o = c_0, c_1, c_2, \ldots, c_r = o$. The first clasp of an earring H is its lowest numbered clasp, and the last
clasp is its highest numbered clasp. (If the origin is a clasp of H, it is considered the first clasp, not the last.)
The arc of an earring is the subgraph of C found by traversing clockwise from its first clasp $c_p$ to its last clasp
$c_q$; the length of this arc is $q - p$. (That is, the length of an arc is the number of edges it contains.) Note that if
an arc contains the origin, it must be the first vertex of the arc. Figure 1 illustrates several of these definitions.
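The definitions above can be made concrete with a small sketch. The following Python fragment is an illustration only, not part of the algorithm's analysis: the function name `earrings_of_cycle` and the edge-list input format are assumptions made for this example. It assumes a connected graph, ignores chords of the cycle, and takes the first vertex of `cycle` as the origin.

```python
from collections import defaultdict

def earrings_of_cycle(edges, cycle):
    """Return (component, clasps, (first, last)) for each earring of `cycle`.

    `edges` is an iterable of undirected edges (u, v); `cycle` lists the
    vertices of C in clockwise order, starting at the chosen origin.
    Assumes the graph is connected, so every earring has a clasp.
    """
    index = {v: i for i, v in enumerate(cycle)}  # vertex c_i -> i
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    seen, result = set(), []
    for start in adj:
        if start in index or start in seen:
            continue
        # Depth-first search inside G - C collects one earring.
        comp, clasps, stack = set(), set(), [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in adj[x]:
                if y in index:
                    clasps.add(y)        # a neighbour on C is a clasp
                elif y not in seen:
                    seen.add(y)
                    stack.append(y)
        first = min(index[c] for c in clasps)  # origin (index 0) counts as first
        last = max(index[c] for c in clasps)
        result.append((comp, clasps, (first, last)))
    return result

# A 6-cycle with two single-vertex earrings "a" and "b".
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
         ("a", 1), ("a", 3), ("b", 0), ("b", 4)]
info = earrings_of_cycle(edges, [0, 1, 2, 3, 4, 5])
# Earring of minimum arc length (arc length = last index - first index).
shortest = min(info, key=lambda e: e[2][1] - e[2][0])
```

In this toy instance the earring `{"a"}` has arc from $c_1$ to $c_3$ (length 2), which is shorter than the arc of `{"b"}` from $c_0$ to $c_4$ (length 4).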

Theorem 3.5. Let H be an earring of minimum arc length. Every segment contained in the arc of H is safe. 

Proof. Let $\mathcal{H}$ be the set of earrings with arc identical to that of H. Since they have the same arc, we refer to
this as the arc of $\mathcal{H}$, or the critical arc. Let the first clasp of every earring in $\mathcal{H}$ be $c_a$, and the last clasp of each
earring in $\mathcal{H}$ be $c_b$. Because the earrings in $\mathcal{H}$ have arcs of minimum length, any earring $H' \notin \mathcal{H}$ has a clasp $c_x$
that is not in the critical arc. (That is, $x < a$ or $x > b$.)

We must show that every segment contained in the critical arc is safe; recall that a segment S is safe if the
graph $G - S$ is 2-connected. Given an arbitrary segment S in the critical arc, let $c_p$ and $c_q$ ($p < q$) be the
anchors that are its endpoints. We prove that there are always 2 internally vertex-disjoint paths between $c_p$ and
$c_q$ in $G - S$; this suffices to show 2-connectivity.

^4 If $H_i$ were simply a path, it would be an ear of C, but $H_i$ may be more complex.

Figure 2: The various cases of Theorem 3.5 are illustrated in the order presented. In each case, one of the 2
vertex-disjoint paths from $c_p$ to $c_q$ is indicated with dashed lines, and the other with dotted lines.

We consider several cases, depending on the earrings that contain $c_p$ and $c_q$. Figure 2 illustrates these cases.
If $c_p$ and $c_q$ are contained in the same earring H', it is easy to find two vertex-disjoint paths between them in
$G - S$. The first path is clockwise from $c_q$ to $c_p$ in the cycle C. The second path is entirely contained in the
earring H' (an earring is connected in $G - C$, so we can always find such a path).

Otherwise, $c_p$ and $c_q$ are clasps of distinct earrings. We consider three cases: both $c_p$ and $c_q$ are clasps of
earrings in $\mathcal{H}$, exactly one is, or neither is.

1. We first consider the case that both $c_p$ and $c_q$ are clasps of earrings in $\mathcal{H}$. Let $c_p$ be a clasp of $H_1$, and $c_q$ a clasp
of $H_2$. The first path is from $c_q$ to $c_a$ through $H_2$, and then clockwise along the critical arc from $c_a$ to $c_p$.
The second path is from $c_q$ to $c_b$ clockwise along the critical arc, and then from $c_b$ to $c_p$ through $H_1$. It is easy
to see that these paths are internally vertex-disjoint.

2. Now, suppose neither $c_p$ nor $c_q$ is a clasp of an earring in $\mathcal{H}$. Let $c_p$ be a clasp of $H_1$, and $c_q$ be a clasp
of $H_2$. The first path we find follows the critical arc clockwise from $c_q$ to $c_b$ (the last clasp of the critical
arc), from $c_b$ to $c_a$ through some earring $H \in \mathcal{H}$, and again clockwise through the critical arc from $c_a$ to $c_p$. Internal
vertices of this path are all in H or on the critical arc. Let $c_{p'}$ be a clasp of $H_1$ not on the critical arc, and
$c_{q'}$ be a clasp of $H_2$ not on the critical arc. The second path goes from $c_p$ to $c_{p'}$ through $H_1$, from
$c_{p'}$ to $c_{q'}$ through the cycle C outside the critical arc, and from $c_{q'}$ to $c_q$ through $H_2$. Internal vertices of
this path are in $H_1$, $H_2$, or in C, but not part of the critical arc (since each of $c_{p'}$ and $c_{q'}$ is outside the
critical arc). Therefore, we have 2 vertex-disjoint paths from $c_p$ to $c_q$.

3. Finally, we consider the case that exactly one of $c_p$ and $c_q$ is a clasp of an earring in $\mathcal{H}$. Suppose $c_p$ is a clasp
of $H_1 \in \mathcal{H}$, and $c_q$ is a clasp of $H_2 \notin \mathcal{H}$. (The other case, where $H_1 \notin \mathcal{H}$ and $H_2 \in \mathcal{H}$, is symmetric and
omitted, though Figure 2 illustrates the paths.) Let $q'$ be the index of a clasp of $H_2$ outside the critical arc.
The first path is from $c_q$ to $c_b$ through the critical arc, and then from $c_b$ to $c_p$ through $H_1$. The second
path is from $c_q$ to $c_{q'}$ through $H_2$, and from $c_{q'}$ to $c_p$ clockwise through C. Note that the last part of this
path enters the critical arc at $c_a$, and continues through the arc until $c_p$. Internal vertices of the first path
that are in C are on the critical arc, but have index greater than q. Internal vertices of the second path that
belong to C are either not in the critical arc, or have index between a and p. Therefore, the two paths
are internally vertex-disjoint. □

We now describe our algorithm to find a non-trivial cycle of good density, proving Theorem 1.3: Let G be a
2-connected graph with edge costs and terminal weights, and at least 2 terminals. There is a polynomial-time
algorithm to find a non-trivial cycle X in G such that $\mathrm{density}(X) \le \mathrm{density}(G)$.

Proof of Theorem 1.3. Let G be a graph with $\ell$ terminals and density $\rho$; we describe a polynomial-time
algorithm that either finds a non-trivial cycle in G of density at most $\rho$, or a proper subgraph G' of G that is
2-connected and contains all $\ell$ terminals. In the latter case, we can recurse on G' until we eventually find a
cycle of density at most $\rho$.

We first find, in polynomial time, a minimum-density cycle C in G. By Theorem 3.3, C has density at most $\rho$,
because the minimum-density non-trivial cycle has at most this density. If C contains at least 2 terminals, we
are done. Otherwise, C contains exactly one terminal v. Since G contains at least 2 terminals, there must exist
at least one earring of C.

Let v be the origin of this cycle C, and H an earring of minimum arc length. By Theorem 3.5, every
segment in the arc of H is safe. Let S be such a segment; since v was selected as the origin, v is not an internal
vertex of S. As v is the only terminal of C, S contains no terminals, and therefore, the graph $G' = G - S$ is
2-connected, and contains all $\ell$ terminals of G. □

The proof above also shows that if G is minimally 2-connected on its terminals (that is, G has no
2-connected proper subgraph containing all its terminals), every cycle of G is non-trivial. (If a cycle contains
0 or 1 terminals, it has a safe segment containing no terminals, which can be deleted; this gives a contradiction.)
Therefore, given a graph that is minimally 2-connected on its terminals, finding a minimum-density non-trivial
cycle is equivalent to finding a minimum-density cycle, and so can be solved exactly in polynomial time. This
suggests a natural algorithm for the problem: Given a graph that is not minimally 2-connected on its terminals,
delete edges and vertices until the graph is minimally 2-connected on the terminals, and then find a
minimum-density cycle. As shown above, this gives a cycle of density no more than that of the input graph, but this
may not be the minimum-density cycle of the original graph. For instance, there exist instances where the
minimum-density cycle uses edges of a safe segment S that might be deleted by this algorithm.



4 Pruning 2-connected Graphs of Good Density 

In this section, we prove Theorem 2.3. We are given a graph G and $S \subseteq V$, a set of at least k terminals. Further,
every terminal in G has 2 vertex-disjoint paths to the root r of total cost at most L. Let $\ell$ be the number of
terminals in G, and cost(G) its total cost; $\rho = \mathrm{cost}(G)/\ell$ is the density of G. We describe an algorithm that finds
a subgraph H of G that contains at least k terminals, each of which is 2-connected to the root, and of total edge
cost $O(\log k)\rho k + 2L$.

We can assume $\ell > (8 \log k) \cdot k$, or the trivial solution of taking the entire graph G suffices. The main phase
of our algorithm proceeds by maintaining a set of 2-connected subgraphs that we call clusters, and repeatedly
finding low-density cycles that merge clusters of similar weight to form larger clusters. (The weight of a cluster
X, denoted by $w_X$, is (roughly) the number of terminals it contains.) Clusters are grouped into tiers by weight;
tier i contains clusters with weight at least $2^i$ and less than $2^{i+1}$. Initially, each terminal is a separate cluster in
tier 0. We say a cluster is large if it has weight at least k, and small otherwise. The algorithm stops when most
terminals are in large clusters.

We now describe the algorithm MergeClusters (see below). To simplify notation, let $\alpha$ be the
quantity $2\lceil \log k \rceil \rho$. We say that a cycle is good if it has density at most $\alpha$; that is, good cycles have density at
most $O(\log k)$ times the density of the input graph.
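As a minimal sketch of the bookkeeping just described, the following Python helpers compute the tier of a cluster, the threshold $\alpha$, and whether a cycle is good. The function names are hypothetical, integer weights are assumed, and the logarithm is taken to base 2 (an assumption suggested by the loop bound in MergeClusters).

```python
def tier(weight):
    """Tier i holds clusters with 2**i <= weight < 2**(i+1) (integer weight >= 1)."""
    return weight.bit_length() - 1  # equals floor(log2(weight))

def ceil_log2(k):
    """ceil(log2(k)) for an integer k >= 1, computed without floating point."""
    return (k - 1).bit_length()

def alpha(cost_G, num_terminals, k):
    """alpha = 2 * ceil(log2 k) * rho, where rho = cost(G) / (number of terminals)."""
    rho = cost_G / num_terminals
    return 2 * ceil_log2(k) * rho

def is_good(cycle_cost, cycle_weight, a):
    """A cycle is good if its density (cost per unit of terminal weight) is at most a."""
    return cycle_weight > 0 and cycle_cost <= a * cycle_weight
```

For example, with cost(G) = 100, 50 terminals and k = 8, the density is 2 and the threshold is 2 · 3 · 2 = 12, so a cycle of cost 24 carrying terminal weight 2 is good, while one of cost 25 is not.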



MergeClusters:

For (each i in {0, 1, ..., $\lceil \log_2 k \rceil - 1$}) do:
    If (i = 0):
        Every terminal has weight 1
    Else:
        Mark all vertices as non-terminals
        For (each small 2-connected cluster X in tier i) do:
            Add a (dummy) terminal $v_X$ to G of weight $w_X$
            Add (dummy) edges of cost 0 from $v_X$ to two (arbitrary) distinct vertices of X
    While (G has a non-trivial cycle C of density at most $\alpha$ in G):
        Let $X_1, X_2, \ldots, X_q$ be the small clusters that contain a terminal or an edge of C.
        (Note that the terminals in C belong to a subset of $\{X_1, \ldots, X_q\}$.)
        Form a new cluster Y (of a higher tier) by merging the clusters $X_1, \ldots, X_q$
        $w_Y \leftarrow \sum_{j=1}^{q} w_{X_j}$
        If (i = 0):
            Mark all terminals in Y as non-terminals
        Else:
            Delete all (dummy) terminals in Y and the associated (dummy) edges.



We briefly remark on some salient features of this algorithm and our analysis before presenting the details 
of the proofs. 

1. In iteration i, the terminals correspond to tier i clusters. Clusters are 2-connected subgraphs of G, and by 
using cycles to merge clusters, we preserve 2-connectivity as the clusters become larger. 

2. When a cycle C is used to merge clusters, all small clusters that contain an edge of C (regardless of 
their tier) are merged to form the new cluster. Therefore, at any stage of the algorithm, all currently small 
clusters are edge-disjoint. Large clusters, on the other hand, are frozen; even if they intersect a good cycle 
C, they are not merged with other clusters on C. Thus, at any time, an edge may be in multiple large 
clusters and up to one small cluster. 

3. In iteration i of MergeClusters, the density of a cycle C is only determined by its cost and the weight 
of terminals in C corresponding to tier i clusters. Though small clusters of other (lower or higher) tiers 
might be merged using C, we do not use their weight to pay for the edges of C. 

4. The ith iteration terminates when no good cycles can be found using the remaining tier i clusters. At 
this point, there may be some terminals remaining that correspond to clusters which are not merged to 
form clusters of higher tiers. However, our choice of $\alpha$ (which defines the density of good cycles) is such
that we can bound the number of terminals that are "left behind" in this fashion. Therefore, when the 
algorithm terminates, most terminals are in large clusters. 

By bounding the density of large clusters, we can find a solution to the rooted k-2VC problem of bounded
density. Because we always use cycles of low density to merge clusters, an analysis similar to that of [22] and
[10] shows that every large cluster has density at most $O(\log^2 k)\rho$. We first present this analysis, though it does
not suffice to prove Theorem 2.3. A more careful analysis shows that there is at least one large cluster of density
at most $O(\log k)\rho$; this allows us to prove the desired theorem.

We now formally prove that MergeClusters has the desired behavior. First, we present a series of 
claims which, together, show that when the algorithm terminates, most terminals are in large clusters, and all 
clusters are 2-connected. 

Remark 4.1. Throughout the algorithm, the graph G is always 2-connected. The weight of a cluster is at most 
the number of terminals it contains. 

Proof. The only structural changes to G are when new vertices are added as terminals; they are added with 
edges to two distinct vertices of G. This preserves 2-connectivity, as does deleting these terminals with the 
associated edges. 

To see that the second claim is true, observe that if a terminal contributes weight to a cluster, it is contained 
in that cluster. A terminal can be in multiple clusters, but it contributes to the weight of exactly one cluster. □ 

We use the following simple proposition in proofs of 2-connectivity; the proof is straightforward, and hence 
omitted. 

Proposition 4.2. Let $H_1 = (V_1, E_1)$ and $H_2 = (V_2, E_2)$ be 2-connected subgraphs of a graph G(V, E) such
that $|V_1 \cap V_2| \ge 2$. Then the graph $H_1 \cup H_2 = (V_1 \cup V_2, E_1 \cup E_2)$ is 2-connected.
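The proposition can be sanity-checked on small instances. The sketch below is an illustration, not a proof: it uses a naive 2-connectivity test (the graph is connected and stays connected after deleting any single vertex). The helper names are hypothetical, and graphs with fewer than 3 vertices are treated as not 2-connected by convention.

```python
def _connected(vertices, adj, removed=None):
    """True if the graph induced on `vertices` minus `removed` is connected."""
    vs = [v for v in vertices if v != removed]
    if not vs:
        return True
    seen, stack = {vs[0]}, [vs[0]]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y != removed and y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(vs)

def is_two_connected(edges):
    """Naive test: connected, and no single vertex deletion disconnects it."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    vertices = list(adj)
    if len(vertices) < 3 or not _connected(vertices, adj):
        return False
    return all(_connected(vertices, adj, removed=v) for v in vertices)

H1 = [(1, 2), (2, 3), (3, 4), (4, 1)]      # a 4-cycle
H2 = [(3, 4), (4, 5), (5, 6), (6, 3)]      # shares vertices 3 and 4 with H1
H3 = [(4, 5), (5, 6), (6, 4)]              # shares only vertex 4 with H1

union_two_shared = is_two_connected(H1 + H2)   # hypothesis of the proposition holds
union_one_shared = is_two_connected(H1 + H3)   # vertex 4 is a cut vertex
```

The second union shows why two shared vertices are needed: with a single shared vertex, that vertex separates the two subgraphs.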

Lemma 4.3. The clusters formed by MergeClusters are all 2-connected. 

Proof. Let Y be a cluster formed by using a cycle C to merge clusters $X_1, X_2, \ldots, X_q$. The edges of the
cycle C form a 2-connected subgraph of G, and we assume that each $X_j$ is 2-connected by induction. Further,
C contains at least 2 vertices of each $X_j$,^5 so we can use induction and Proposition 4.2 above: We assume
$C \cup X_1 \cup \cdots \cup X_j$ is 2-connected by induction, and C contains 2 vertices of $X_{j+1}$, so $C \cup X_1 \cup \cdots \cup X_{j+1}$ is 2-connected.

Note that we have shown $Y = C \cup X_1 \cup \cdots \cup X_q$ is 2-connected, but C (and hence Y) might contain dummy
terminals and the corresponding dummy edges. However, each such terminal with the 2 associated edges is an
ear of Y; deleting them leaves Y 2-connected. □

Lemma 4.4. The total weight of small clusters in tier i that are not merged to form clusters of higher tiers is at
most $\frac{\ell}{2\lceil \log k \rceil}$.

Proof. Assume this were not true; this means that MergeClusters could find no more cycles of density at
most $\alpha$ using the remaining small tier i clusters. But the total cost of all the edges is at most cost(G), and
the sum of terminal weights is at least $\frac{\ell}{2\lceil \log k \rceil}$; this implies that the density of the graph (using the remaining
terminals) is at most $2\lceil \log k \rceil \cdot \frac{\mathrm{cost}(G)}{\ell} = \alpha$. But by Theorem 3.3, the graph must then contain a good non-trivial
cycle, and so the while loop would not have terminated. □

Corollary 4.5. When the algorithm MergeClusters terminates, the total weight of large clusters is at least
$\ell/2 > (4 \log k) \cdot k$.

Proof. Each terminal not in a large cluster contributes to the weight of a cluster that was not merged with others
to form a cluster of a higher tier. The previous lemma shows that the total weight of such clusters in any tier
is at most $\frac{\ell}{2\lceil \log k \rceil}$; since there are $\lceil \log k \rceil$ tiers, the total number of terminals not in large clusters is less than
$\ell/2$. □

^5 A cluster $X_j$ may be a singleton vertex (for instance, if we are in tier 0), but such a vertex does not affect 2-connectivity.

So far, we have shown that most terminals reach large clusters, all of which are 2-connected, but we have
not argued about the density of these clusters. The next lemma says that if we can find a large cluster of good
density, we can find a solution to the k-2VC problem of good density.

Lemma 4.6. Let Y be a large cluster formed by MergeClusters. If Y has density at most $\delta$, we can find a
graph Y' with at least k terminals, each of which is 2-connected to r, of total cost at most $2\delta k + 2L$.

Proof. Let $X_1, X_2, \ldots, X_q$ be the clusters merged to form Y in order around the cycle C that merged them; each
$X_j$ was a small cluster, of weight at most k. A simple averaging argument shows that there is a consecutive
segment of $X_j$s with total weight between k and 2k, such that the cost of the edges of C connecting these
clusters, together with the costs of the clusters themselves, is at most $2\delta k$. Let $X_a$ be the "first" cluster of this
segment, and $X_b$ the "last". Let v and w be arbitrary terminals of $X_a$ and $X_b$ respectively. Connect each of v
and w to the root r using 2 vertex-disjoint paths; the cost of this step is at most 2L. (We assumed that every
terminal could be 2-connected to r using disjoint paths of cost at most L.) The graph Y' thus constructed has
at least k terminals, and total cost at most $2\delta k + 2L$.

We show that every vertex z of Y' is 2-connected to r; this completes our proof. Let z be an arbitrary vertex
of Y'; suppose there is a cut-vertex x which, when deleted, separates z from r. Both v and w are 2-connected
to r, and therefore neither is in the same component as z in $Y' - x$. However, we describe 2 vertex-disjoint
paths $P_v$ and $P_w$ in Y' from z to v and w respectively; deleting x cannot separate z from both v and w, which
gives a contradiction. The paths $P_v$ and $P_w$ are easy to find; let $X_j$ be the cluster containing z.^6 The cycle C
contains a path from vertex $z_1 \in X_j$ to $v' \in X_a$, and another (vertex-disjoint) path from $z_2 \in X_j$ to $w' \in X_b$.
Concatenating these paths with paths from v' to v in $X_a$ and w' to w in $X_b$ gives us vertex-disjoint paths $P_1$
from $z_1$ to v and $P_2$ from $z_2$ to w. Since $X_j$ is 2-connected, we can find vertex-disjoint paths from z to $z_1$ and
$z_2$, which gives us the desired paths $P_v$ and $P_w$. □
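The segment-selection step in the proof of Lemma 4.6 can be illustrated with a brute-force search. The sketch below is not the averaging argument itself and does not verify the cost bound; it simply enumerates runs of consecutive clusters around the cycle and returns the cheapest run whose total weight lies between k and 2k. The function name and the input encoding (positive `weights`, `cluster_costs`, and `link_costs[j]` for the cycle edges joining cluster j to cluster j+1, cyclically) are assumptions made for this example.

```python
def cheapest_window(weights, cluster_costs, link_costs, k):
    """Cheapest run of consecutive clusters with total weight in [k, 2k].

    Returns (cost, start, length), or None if no such run exists.
    The cost of a run is the cost of its clusters plus the cost of the
    cycle edges linking consecutive clusters inside the run.
    """
    q = len(weights)
    best = None
    for start in range(q):
        w = cost = 0
        for length in range(1, q + 1):
            j = (start + length - 1) % q
            if length > 1:
                cost += link_costs[(j - 1) % q]  # cycle edges from the previous cluster to j
            w += weights[j]
            cost += cluster_costs[j]
            if w > 2 * k:
                break                            # weights are positive, so w only grows
            if w >= k and (best is None or cost < best[0]):
                best = (cost, start, length)
    return best

weights = [3, 2, 4, 1, 5]
cluster_costs = [4, 1, 6, 1, 9]
link_costs = [2, 1, 3, 1, 2]
result = cheapest_window(weights, cluster_costs, link_costs, k=6)
```

Here the cheapest qualifying run starts at cluster 1 and has length 2 (clusters 1 and 2, total weight 6, cost 1 + 1 + 6 = 8).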



We now present the two analyses of density referred to earlier. The key difference between the weaker and
tighter analysis is in the way we bound edge costs. In the former, each large cluster pays for its edges separately,
using the fact that all cycles used have density at most $\alpha = O(\log k)\rho$. In the latter, we crucially use the fact
that small clusters which share edges are merged. Roughly speaking, because small clusters are edge-disjoint,
the average density of small clusters must be comparable to the density of the input graph G. Once an edge is
in a large cluster, we can no longer use the edge-disjointness argument. We must pay for these edges separately,
but we can bound this cost.

First, the following lemma allows us to show that every large cluster has density at most $O(\log^2 k)\rho$.

Lemma 4.7. For any cluster Y formed by MergeClusters during iteration i, the total cost of edges in Y is
at most $(i + 1) \cdot \alpha w_Y$.

Proof. We prove this lemma by induction on the number of vertices in a cluster. Let S be the set of clusters
merged using a cycle C to form Y. Let $S_1$ be the set of clusters in S of tier i, and $S_2$ be $S - S_1$. ($S_2$ contains
clusters of tiers less or greater than i that contained an edge of C.)

The cost of edges in Y is at most the sum of: the cost of C, the cost of $S_1$, and the cost of $S_2$. Since all
clusters in $S_2$ have been formed during iteration i or earlier, and are smaller than Y, we can use induction to
show that the cost of edges in $S_2$ is at most $(i + 1)\alpha \sum_{X \in S_2} w_X$. All clusters in $S_1$ are of tier i, and so must
have been formed before iteration i (any cluster formed during iteration i is of a strictly greater tier), so we use
induction to bound the cost of edges in $S_1$ by $i\alpha \sum_{X \in S_1} w_X$.

Finally, because C was a good-density cycle, and only clusters of tier i contribute to calculating the
density of C, the cost of C is at most $\alpha \sum_{X \in S_1} w_X$. Therefore, the total cost of edges in Y is at most
$(i + 1)\alpha \sum_{X \in S} w_X = (i + 1)\alpha w_Y$. □

^6 The vertex z may not be in any cluster $X_j$. In this case, $P_v$ is formed by using edges of C from z to $v' \in X_a$, and then a path from
v' to v; $P_w$ is formed similarly.



Let Y be an arbitrary large cluster; since we have only $\lceil \log k \rceil$ tiers, the previous lemma implies that the
cost of Y is at most $\lceil \log k \rceil \cdot \alpha w_Y = O(\log^2 k)\rho w_Y$. That is, the density of Y is at most $O(\log^2 k)\rho$, and
we can use this fact together with Lemma 4.6 to find a solution to the rooted k-2VC problem of cost at most
$O(\log^2 k)\rho k + 2L$. This completes the 'weaker' analysis, but this does not suffice to prove Theorem 2.3; to
prove the theorem, we would need to use a large cluster Y of density $O(\log k)\rho$, instead of $O(\log^2 k)\rho$.
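For concreteness, the constants hidden in the weaker bound can be written out as follows. This is a worked restatement under the definitions above ($\alpha = 2\lceil \log k \rceil \rho$, and a cluster formed in iteration $i \le \lceil \log k \rceil - 1$), not an additional claim:

```latex
\begin{align*}
\mathrm{cost}(Y) &\le (i+1)\,\alpha\, w_Y
  \le \lceil \log k \rceil \cdot 2\lceil \log k \rceil \rho \cdot w_Y
  = 2\lceil \log k \rceil^{2} \rho\, w_Y, \\
\mathrm{cost}(Y') &\le 2\delta k + 2L
  \quad\text{with } \delta = 2\lceil \log k \rceil^{2}\rho
  \;\Longrightarrow\;
  \mathrm{cost}(Y') \le 4\lceil \log k \rceil^{2}\rho k + 2L .
\end{align*}
```

The first line is Lemma 4.7 applied to a large cluster; the second substitutes the resulting density bound into Lemma 4.6.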

For the purpose of the more careful analysis, implicitly construct a forest $\mathcal{F}$ on the clusters formed by
MergeClusters. Initially, the vertex set of $\mathcal{F}$ is just S, the set of terminals, and $\mathcal{F}$ has no edges. Every time
a cluster Y is formed by merging $X_1, X_2, \ldots, X_q$, we add a corresponding vertex Y to the forest $\mathcal{F}$, and add
edges from Y to each of $X_1, \ldots, X_q$; Y is the parent of $X_1, \ldots, X_q$. We also associate a cost with each vertex
in $\mathcal{F}$; the cost of the vertex Y is the cost of the cycle used to form Y from $X_1, \ldots, X_q$. We thus build up trees
as the algorithm proceeds; the root of any tree corresponds to a cluster that has not yet become part of a bigger
cluster. The leaves of the trees correspond to vertices of G; they all have cost 0. Also, any large cluster Y
formed by the algorithm is at the root of its tree; we refer to this tree as $T_Y$.
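The forest described above is a simple parent/child structure, and a minimal sketch of it follows. The class and function names (`ClusterNode`, `merge`, `total_cycle_cost`) are hypothetical and chosen only for this illustration; each node stores the cost of the cycle that created it, and leaves (terminals) have cost 0.

```python
class ClusterNode:
    """A vertex of the forest: one cluster formed during the algorithm."""
    def __init__(self, name, cost=0.0, weight=0):
        self.name = name
        self.cost = cost        # cost of the cycle used to form this cluster
        self.weight = weight    # w_X for this cluster
        self.parent = None
        self.children = []

def merge(name, cycle_cost, parts):
    """Create a new root Y whose children are the current roots in `parts`."""
    y = ClusterNode(name, cost=cycle_cost, weight=sum(p.weight for p in parts))
    for p in parts:
        if p.parent is not None:
            raise ValueError("only clusters that are still roots can be merged")
        p.parent = y
        y.children.append(p)
    return y

def total_cycle_cost(node):
    """Sum of the costs of all vertices in the tree rooted at `node`."""
    return node.cost + sum(total_cycle_cost(c) for c in node.children)

# Four terminals: two are merged into X, then X and the rest into Y.
t1, t2, t3, t4 = (ClusterNode(f"t{i}", weight=1) for i in range(1, 5))
X = merge("X", 3.0, [t1, t2])
Y = merge("Y", 5.0, [X, t3, t4])
```

In this toy run, `Y` has weight 4 and the vertex costs in its tree sum to 8.0, the total cost of the two cycles used.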

For each large cluster Y after MergeClusters terminates, say that Y is of type i if Y was formed
during iteration i of MergeClusters. We now define the final-stage clusters of Y: They are the clusters formed
during iteration i that became part of Y. (We include Y itself in the list of final-stage clusters; even though Y
was formed in iteration i of MergeClusters, it may contain other final-stage clusters. For instance, during
iteration i, we may merge several tier i clusters to form a cluster X of tier j > i. Then, if we find a good-density
cycle C that contains an edge of X, X will merge with the other clusters of C.) The penultimate clusters of
Y are those clusters that exist just before the beginning of iteration i and become a part of Y. Equivalently,
the penultimate clusters are those formed before iteration i that are the immediate children in $T_Y$ of final-stage
clusters. Figure 3 illustrates the definitions of final-stage and penultimate clusters. Such a tree could be formed
if, in iteration i - 1, 4 clusters of this tier merged to form D, a cluster of tier i + 1. Subsequently, in iteration
i, clusters H and J merge to form F. We next find a good cycle containing E and G; F contains an edge of
this cycle, so these three clusters are merged to form B. Note that the cost of this cycle is paid for by the
weights of E and G only; F is a tier i + 1 cluster, and so its weight is not included in the density calculation.
Finally, we find a good cycle paid for by A and C; since B and D share edges with this cycle, they all merge to
form the large cluster Y.



Figure 3: A part of the tree $T_Y$ corresponding to Y, a large cluster of type i. The number in each vertex
indicates the tier of the corresponding cluster. Only final-stage and penultimate clusters are shown: final-stage
clusters are indicated with a double circle; all other clusters are penultimate.

An edge of a large cluster Y is said to be a final edge if it is used in a cycle C that produces a final-stage
cluster of Y. All other edges of Y are called penultimate edges; note that any penultimate edge is in some
penultimate cluster of Y. We define the final cost of Y to be the sum of the costs of its final edges, and its
penultimate cost to be the sum of the costs of its penultimate edges; clearly, the cost of Y is the sum of its final
and penultimate costs. We bound the final costs and penultimate costs separately.

Recall that an edge is a final edge of a large cluster Y if it is used by MergeClusters to form a cycle
C in the final iteration during which Y is formed. The reason we can bound the cost of final edges is that the
cost of any such cycle is at most $\alpha$ times the weight of clusters contained in the cycle, and a cluster does not
contribute to the weight of more than one cycle in an iteration. (This is also the essence of Lemma 4.7.) We
formalize this intuition in the next lemma.

Lemma 4.8. The final cost of any large cluster Y is at most $\alpha w_Y$, where $w_Y$ is the weight of Y.

Proof. Let Y be an arbitrary large cluster. In the construction of the tree $T_Y$, we associated with each vertex
of $T_Y$ the cost of the cycle used to form the corresponding cluster. To bound the total final cost of Y, we must
bound the sum of the costs of vertices of $T_Y$ associated with final-stage clusters. The weight of Y, $w_Y$, is at
least the sum of the weights of the penultimate tier i clusters that become a part of Y. Therefore, it suffices
to show that the sum of the costs of vertices of $T_Y$ associated with final-stage clusters is at most $\alpha$ times the
sum of the weights of Y's penultimate tier i clusters. (Note that a tier i cluster must have been formed prior to
iteration i, and hence it cannot itself be a final-stage cluster.)

A cycle was used to construct a final-stage cluster X only if its cost was at most $\alpha$ times the sum of weights
of the penultimate tier i clusters that become a part of X. (Larger clusters may become a part of X, but they do
not contribute weight to the density calculation.) Therefore, if X is a vertex of $T_Y$ corresponding to a final-stage
cluster, the cost of X is at most $\alpha$ times the sum of the weights of its tier i immediate children in $T_Y$. But $T_Y$
is a tree, and so no vertex corresponding to a penultimate tier i cluster has more than one parent. That is, the
weight of a penultimate cluster pays for only one final-stage cluster. Therefore, the sum of the costs of vertices
associated with final-stage clusters is at most $\alpha$ times the sum of the weights of Y's penultimate tier i clusters,
and so the final cost of Y is at most $\alpha w_Y$. □

Lemma 4.9. If $Y_1$ and $Y_2$ are distinct large clusters of the same type, no edge is a penultimate edge of both $Y_1$
and $Y_2$.

Proof. Suppose, by way of contradiction, that some edge e is a penultimate edge of both $Y_1$ and $Y_2$, which are
large clusters of type i. Let $X_1$ (respectively $X_2$) be a penultimate cluster of $Y_1$ (resp. $Y_2$) containing e. As
penultimate clusters, both $X_1$ and $X_2$ are formed before iteration i. But until iteration i, neither is part of a
large cluster, and two small clusters cannot share an edge without being merged. Therefore, $X_1$ and $X_2$ must
have been merged, so they cannot belong to distinct large clusters, giving the desired contradiction. □

Theorem 4.10. After MergeClusters terminates, at least one large cluster has density at most $O(\log k)\rho$.

Proof. We define the penultimate density of a large cluster to be the ratio of its penultimate cost to its weight.

Consider the total penultimate costs of all large clusters: For any i, each edge $e \in E(G)$ can be a penultimate
edge of at most 1 large cluster of type i. This implies that each edge can be a penultimate edge of at most $\lceil \log k \rceil$
clusters. Therefore, the sum of penultimate costs of all large clusters is at most $\lceil \log k \rceil \cdot \mathrm{cost}(G)$. Further, the
total weight of all large clusters is at least $\ell/2$. Therefore, the (weighted) average penultimate density of large
clusters is at most $2\lceil \log k \rceil \cdot \mathrm{cost}(G)/\ell = 2\lceil \log k \rceil \rho$, and hence there exists a large cluster Y of penultimate density
at most $2\lceil \log k \rceil \rho$.

The penultimate cost of Y is, therefore, at most $2\lceil \log k \rceil \rho w_Y$, and from Lemma 4.8, the final cost of Y is
at most $\alpha w_Y$. Therefore, the density of Y is at most $\alpha + 2\lceil \log k \rceil \rho = O(\log k)\rho$. □

Theorem 4.10 and Lemma 4.6 together imply that we can find a solution to the rooted k-2VC problem of
cost at most $O(\log k)\rho k + 2L$. This completes our proof of Theorem 2.3.
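Spelling out the constants in this final step may be helpful. The following is a worked restatement using only $\alpha = 2\lceil \log k \rceil \rho$, Theorem 4.10 and Lemma 4.6, with no new assumptions beyond those definitions:

```latex
\begin{align*}
\mathrm{density}(Y) &\le \alpha + 2\lceil \log k \rceil \rho
  = 2\lceil \log k \rceil \rho + 2\lceil \log k \rceil \rho
  = 4\lceil \log k \rceil \rho, \\
\mathrm{cost}(Y') &\le 2\delta k + 2L
  \quad\text{with } \delta = 4\lceil \log k \rceil \rho
  \;\Longrightarrow\;
  \mathrm{cost}(Y') \le 8\lceil \log k \rceil \rho k + 2L = O(\log k)\rho k + 2L .
\end{align*}
```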

5 Conclusions 

We list the following open problems: 

• Can the approximation ratio for the k-2VC problem be improved from the current $O(\log \ell \log k)$ to
$O(\log n)$ or better? Removing the dependence on $\ell$ to obtain even $O(\log^2 k)$ could be interesting. If not,
can one improve the approximation ratio for the easier k-2EC problem?

• Can we obtain approximation algorithms for the k-$\lambda$VC or k-$\lambda$EC problems for $\lambda > 2$? In general, few
results are known for problems where vertex-connectivity is required to be greater than 2, but there has 
been more progress with higher edge-connectivity requirements. 

• Given a 2-connected graph of density p with some vertices marked as terminals, we show that it contains 
a non-trivial cycle with density at most $\rho$, and give an algorithm to find such a cycle. We have also found
a logarithmic approximation for the problem of finding a minimum-density non-trivial cycle. Is there a
constant-factor approximation for this problem? Can it be solved exactly in polynomial time? 

Acknowledgments: We thank Mohammad Salavatipour for helpful discussions on k-2EC and related
problems. We thank Erin Wolf Chambers for useful suggestions on notation.